ORIGINAL ARTICLE

Genome-wide study of association and interaction with maternal 
cytomegalovirus infection suggests new schizophrenia loci 

AD Børglum 1,2,3, D Demontis 1,3, J Grove 1,3,4, J Pallesen 1,3, MV Hollegaard 5, CB Pedersen 3,6, A Hedemand 1,3, M Mattheisen 7,8,9,
GROUP investigators 10, A Uitterlinden 11, M Nyegaard 1,3, T Ørntoft 12, C Wiuf 4,13, M Didriksen 14, M Nordentoft 3,15, MM Nöthen 7,16,17,
M Rietschel 18, RA Ophoff 19, S Cichon 7,16,20, RH Yolken 21, DM Hougaard 5, PB Mortensen 3,6 and O Mors 2,3

Genetic and environmental components as well as their interaction contribute to the risk of schizophrenia, making it highly relevant 
to include environmental factors in genetic studies of schizophrenia. This study comprises genome-wide association (GWA) and 
follow-up analyses of all individuals born in Denmark since 1981 and diagnosed with schizophrenia as well as controls from the 
same birth cohort. Furthermore, we present the first genome-wide interaction survey of single nucleotide polymorphisms (SNPs) 
and maternal cytomegalovirus (CMV) infection. The GWA analysis included 888 cases and 882 controls, and the follow-up 
investigation of the top GWA results was performed in independent Danish (1396 cases and 1803 controls) and German-Dutch 
(1169 cases, 3714 controls) samples. The SNPs most strongly associated in the single-marker analysis of the combined Danish 
samples were rs4757144 in ARNTL (P = 3.78 × 10⁻⁶) and rs8057927 in CDH13 (P = 1.39 × 10⁻⁵). Both genes have previously been
linked to schizophrenia or other psychiatric disorders. The most strongly associated SNP in the combined analysis, including Danish and
German-Dutch samples, was rs12922317 in RUNDC2A (P = 9.04 × 10⁻⁷). A region-based analysis summarizing independent signals
in segments of 100 kb identified a new region-based genome-wide significant locus overlapping the gene ZEB1 (P = 7.0 × 10⁻⁷).
This signal was replicated in the follow-up analysis (P = 2.3 × 10⁻²). Significant interaction with maternal CMV infection was found
for rs7902091 (P_SNP×CMV = 7.3 × 10⁻⁷) in CTNNA3, a gene not previously implicated in schizophrenia, stressing the importance of
including environmental factors in genetic studies. 

Molecular Psychiatry (2014) 19, 325-333; doi:10.1038/mp.2013.2; published online 29 January 2013 
Keywords: CTNNA3; gene-environment interaction; GWAS; GWIS; region-wise analysis; ZEB1 



INTRODUCTION 

Schizophrenia is a severe life-long mental disorder, which affects 
approximately 1% of the population worldwide. Several studies 
have documented a strong genetic component in the etiology 
and the heritability is estimated to be around 80%. 1 Besides the
genetic component, environmental factors as well as gene- 
environment interactions are believed to contribute to the disease 
risk. 2 Numerous linkage, candidate gene studies and genome- 
wide association (GWA) studies have been performed in order to 
elucidate the genetic architecture of the disease. 3-8 These studies 
have implicated several genes in disease risk but seldom 
unambiguously across different studies and populations. In the 
GWA studies, only a few loci have passed the generally accepted 
level of P < 5 × 10⁻⁸ for genome-wide significance. 4,5,9-11 From
the GWA studies, it can be concluded that only moderate levels of 
association of common variants with schizophrenia can be 
expected, and recent results suggest that a high number of
common susceptibility variants of small effect are involved,
collectively capturing around 30% of the genetic risk. 12,13 The 
remaining genetic risk could involve de novo mutations, rare 
variants and gene-environment interactions. 14-18 Due to the low
effect sizes of the common risk variants, the heritability that can
be accounted for by those identified so far has been estimated to
be between 1% and 2%. 19,20

It is well documented that the environment has an important 
role in the development of schizophrenia. 21 Especially early in life 
the susceptibility to environmental risk factors may be increased, 
supported by several studies demonstrating an association of 
maternal infection with increased risk of the child developing 
schizophrenia later in life. 22-25 It has also been reported that
interaction between genetic variation in the offspring and markers 
of maternal infection (maternal antibodies) may influence the risk 
of schizophrenia, 26 stressing the importance of taking environ- 
mental factors into account in genetic studies. 

The Danish population is, in general, considered ethnically
homogeneous with only recent immigration of non-Caucasian
individuals, which makes it suitable for genetic studies. In
Denmark, all newborn babies (around 65,000 each year) are
screened for metabolic diseases and since 1981 the surplus of the
analyzed blood spot samples has been stored in the Danish
Newborn Screening Biobank (DNSB). 27 Coupling with information 
from the Danish Psychiatric Central Research Register 28 allows for 
the unique opportunity to obtain DNA from all individuals who 
have been diagnosed with schizophrenia since 1981. Furthermore, 
as the DNSB samples are obtained from babies before they are 
able to produce their own antibodies and hence the antibodies in 
the blood reflect the mother's antibodies, it is possible to 
investigate how maternal infections and their interactions with 
genetic variations in the offspring influence the risk of 
schizophrenia. Infection with cytomegalovirus (CMV), a neurotropic
virus of the herpesvirus family, has been associated with
schizophrenia in several studies, and interactions between
selected genes and CMV have been reported. 29,30

Here we report the results of a GWA study and follow-up 
investigation of all Danish individuals born since 1981 and 
diagnosed with schizophrenia, including single variant as well as 
regional analyses. In addition, as the first genome-wide gene- 
environment interaction study in schizophrenia, we examine 
interaction between single nucleotide polymorphisms (SNPs) and 
maternal CMV infection (maternal anti-CMV immunoglobulin G 
(IgG) antibody titer). 



MATERIALS AND METHODS 

Study design and power calculation 

A two-stage design was applied in this study. In stage 1, a GWA analysis of 
888 cases and 882 controls was performed and, in stage 2, a follow-up 
analysis of the strongest associated SNPs was performed on an 
independent sample consisting of 1396 cases and 1803 controls. Com- 
bined association analysis of both samples achieves a power of 80% to 
detect a disease allele with a frequency of 0.36 and odds ratio (OR) of 1.35 
assuming prevalence of 0.01 at a significance level of 5 × 10⁻⁸. 31 The SNPs
most strongly associated with schizophrenia were analyzed further by
combining stage 1 and stage 2 individuals with a German-Dutch sample, 
a sample genetically closely related to the Danish. 32 



Study subjects and phenotype definition 

Stage 1: It was possible to identify the samples of interest based on the 
unique personal identification number (CPR-number), which is assigned to 
all live-born babies in Denmark. This number is stored in the Danish Civil 
Registration System (DCRS) 33 and is used in all the contacts with the public 
sector. In this study, information from the DCRS was linked with the 
information stored in the nationwide Danish Psychiatric Central Register 28 
in order to identify all individuals born in 1981 or later who by 2006
had been diagnosed with schizophrenia according to ICD-10-DCR (The 
ICD-10 Classification of Mental and Behavioural Disorders Diagnostic 
Criteria for Research; F20). For each schizophrenia case, one matched 
control individual was randomly selected with the same gender, date of 
birth and age and with no history of schizophrenia on the date of first 
diagnosis of schizophrenia of the case. Using this procedure, 915 cases and
915 controls were identified and subsequently dried blood spots from the 
individuals were obtained from the DNSB. 

Stage 2: All individuals born in 1981 or later and diagnosed with
schizophrenia according to ICD-10-DCR, F20 between 2006 and 2010 and
matched controls were identified as described above. In all, 1149 cases and
1303 controls were identified and subsequently blood spots were obtained
from the DNSB. Furthermore, a sample of 247 schizophrenia cases fulfilling 
ICD-10 criteria and 500 controls were included as described previously. 7,26 
The cases and controls were Danish Caucasians. 

German-Dutch replication sample: A total of 1169 schizophrenia cases 
(464 German and 705 Dutch) and 3714 ethnically matched controls (1272 
German and 2442 Dutch) were used in the replication analysis (details on 
criteria for inclusion can be found in Rietschel et al. 5 ). Descriptive data for 
the three samples can be found in Supplementary Table S1. This study has
been approved by the Danish Data Protection Agency and the local ethics
committees in Denmark and abroad. 

Genotyping and quality control (QC) 

Stage 1: Sufficient biological material was available for 909 cases and 899 
controls. DNA was extracted from the dried blood spots using Extract-N- 
Amp Blood PCR kit (Sigma Aldrich, Seelze, Germany) and subsequently 
whole genome-amplified in triplicates using the RepliG kit (Qiagen, Venlo, 
The Netherlands). 34 The three separate reactions were pooled before
genotyping, which was done using the Illumina Human610-Quad
BeadChip (San Diego, CA, USA). In all, 1774 individuals (892 cases, 882
controls) with gender in concordance with the register information were 
successfully genotyped with a call rate >0.97. Stringent QC was applied to 
data from samples with a call rate >0.97. The QC excluded SNPs with a call 
rate <0.99, SNPs with a deviation from Hardy-Weinberg equilibrium 
(P < 0.0001 in controls) and a minor allele frequency (MAF) < 0.0015.
Furthermore, test for relatedness, estimation of individual heterozygosity 
and test for non-random missingness of SNPs between cases and controls 
were conducted (Supplementary Table S2). After QC, 1770 individuals (882 
controls, 888 cases) and 541,148 SNPs were left for further analysis. 

Stage 2: DNA from the 1149 cases and 1303 controls obtained from the
DNSB was extracted and whole genome-amplified using the kits described 
above. DNA was isolated from blood samples from the additional 247 
cases and 500 controls following standard procedures. In all, 193 follow-up 
SNPs (Supplementary Table S3) were genotyped as well as five SNPs on the 
sex-chromosomes, using the Sequenom MassARRAY genotyping platform 
(Sequenom, San Diego, CA, USA) following the protocol described in 
Nyegaard et al. 7 It was checked that the estimated gender, based on the 
genotypic information, was in concordance with the gender given in the 
DCRS. In order to exclude the presence of identical samples, an identity by 
state analysis was performed using the software Graphical Relationship 
Representation. 35 After QC, 3142 individuals (1370 cases, 1772 controls) 
with a call rate >0.8 were genotyped for 168 SNPs (Supplementary Table 
S3). The SNPs had a call rate >0.9, no significant deviation from Hardy- 
Weinberg equilibrium (P > 0.0001 in controls) and a MAF > 0.0015.

German-Dutch replication sample: The individuals were genotyped 
using the Illumina HumanHap550v3 BeadArray (Illumina). After QC,
genotypes for 475,427 SNPs were available in 1169 cases and 3714 
controls (for details on genotyping and QC, see Rietschel et al. 5 ). 

Antibody measurements 

Measurements of type-specific IgG antibodies to CMV were obtained by 
enzyme immunoassay 36 for a subsample of stage 1 individuals (488 cases
(216 females and 272 males) and equally many controls). The blood spots
stored in the DNSB were taken when the neonates were 2-7 days old. At 
that age a child has not yet produced any significant amount of IgG 
antibodies, but while in utero maternal IgG antibodies are transferred 
across the placenta to the fetus. Hence, the antibodies measured can be 
assumed to be mainly maternal. 37 The measurements were dichotomized
at 0.2 optical density units, yielding a prevalence consistent with those
measured in European populations. 38

Statistical analysis 

All analyses were performed using the software PLINK
(http://pngu.mgh.harvard.edu/~purcell/plink/) 39 unless otherwise stated.

Population stratification 

In order to minimize the effect of spurious association originating from
population stratification, association analysis was performed using logistic
regression with principal component one, derived from principal
component analysis, 40 as covariate (Supplementary Figure S2). This lowered
the genomic inflation factor (λ) from 1.047 to 1.013. A deviation of λ from 1
can be expected under polygenic inheritance even when there is no 
population structure, 41 so in order to avoid unnecessary correction only 
principal component one was used to correct for population stratification 
(more information is provided in the Supplementary Information). 
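
As an illustration of the genomic inflation factor referred to above, the following minimal Python sketch computes λ as the median observed 1-df χ² statistic divided by its expected median under the null. It is not the PLINK implementation used in this study; the function names are hypothetical and the input is assumed to be a list of two-sided P-values strictly between 0 and 1.

```python
from statistics import NormalDist, median

# Median of the chi-square distribution with 1 degree of freedom (about 0.4549).
CHI2_1DF_MEDIAN = NormalDist().inv_cdf(0.75) ** 2


def p_to_chi2_1df(p):
    """Convert a two-sided P-value to the matching 1-df chi-square statistic."""
    z = NormalDist().inv_cdf(1.0 - p / 2.0)
    return z * z


def genomic_inflation(p_values):
    """Return lambda: median observed chi-square over its expected median."""
    stats = [p_to_chi2_1df(p) for p in p_values]
    return median(stats) / CHI2_1DF_MEDIAN
```

Values of λ close to 1, such as the 1.013 obtained after correction, indicate little residual inflation of the test statistics.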

Single-marker association analysis 

GWA analysis was performed using logistic regression with principal 
component one as a covariate applying an additive genetic model. The 
SNPs demonstrating the strongest association with schizophrenia in the 
GWA analysis (2843 SNPs with a P-value < 0.005) were further evaluated in
a meta-analysis by including data from a German-Dutch replication
sample. 5 The meta-analysis implemented in PLINK was used, and a
fixed-effect model was considered.
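
A fixed-effect meta-analysis of this kind is commonly computed by inverse-variance weighting of the per-study log odds ratios. The sketch below illustrates that general technique only; it is not PLINK's code, the function name is hypothetical, and the inputs (per-study log odds ratios and their standard errors) are assumed to be available from the separate association analyses.

```python
import math


def fixed_effect_meta(betas, ses):
    """Combine per-study log odds ratios by inverse-variance weighting.

    betas: per-study effect estimates (log odds ratios)
    ses:   their standard errors
    Returns the pooled estimate, its standard error and a two-sided P-value.
    """
    weights = [1.0 / (se * se) for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    # Two-sided normal P-value; erfc keeps precision for large |z|.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, p
```

Studies with smaller standard errors receive larger weights, so the larger samples dominate the pooled estimate.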

Two types of analysis were applied in order to identify follow-up SNPs 
for genotyping in the stage 2 sample: (1) based on the GWA analysis, a set
of 100 SNPs was identified using a top-down approach, and (2) based on
the meta-analysis, a set of 100 SNPs was identified using a top-down
approach. Association with schizophrenia was analyzed in the stage 2 
sample by logistic regression using an additive genetic model. A binomial 
sign test was performed in order to test for evidence of directionally 
consistent replication. 
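
The binomial sign test mentioned above asks whether more follow-up SNPs show the same direction of effect as in the discovery stage than the 50% expected by chance. A minimal sketch is given below; the function name is hypothetical, and the one-sided form is an assumption, as the sidedness of the test is not stated here.

```python
from math import comb


def sign_test_p(n_consistent, n_total):
    """One-sided exact binomial P-value under a success probability of 0.5.

    Probability of observing at least n_consistent directionally
    consistent SNPs out of n_total if direction were random.
    """
    tail = sum(comb(n_total, k) for k in range(n_consistent, n_total + 1))
    return tail / 2 ** n_total
```

For example, with purely hypothetical counts, `sign_test_p(8, 10)` evaluates the chance of 8 or more consistent directions among 10 SNPs.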

A meta-analysis (as described above) was used to test for association of 
follow-up SNPs with schizophrenia in two combined data sets: (1) the 
combined stage 1 and the stage 2 samples, and (2) the combined stage 1, 
stage 2 and German-Dutch replication sample (hereafter referred to as the
extended meta-analysis), which in total included 3453 cases and 6399
controls. 

Region-wise association analysis 

All chromosomes were divided into overlapping regions of 100 kb, each
overlapping its neighboring regions by 50 kb. For each region, a combined
P-value was calculated by Fisher's method: 42 $X = -2\sum_{i=1}^{k} \log_e(p_i)$, where
$k$ is the number of SNPs in the region and $p_i$ is the P-value for each
SNP calculated by a standard χ² test, using an additive model. The
P-value for each region was calculated by a permutation test shuffling
the case-control status (see Supplementary Material for details).
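
The windowing and the Fisher statistic described above can be sketched as follows. This is an illustration under stated assumptions rather than the code used in the study: the function names, the zero-based half-open coordinates and the handling of the last, shorter window are all hypothetical choices. Under independence X would follow a χ² distribution with 2k degrees of freedom, but SNPs in linkage disequilibrium are correlated, which is why the regional P-value is obtained by permutation (not shown here).

```python
import math


def windows(chrom_length, size=100_000, step=50_000):
    """Yield (start, end) for overlapping regions covering a chromosome."""
    start = 0
    while start < chrom_length:
        yield start, min(start + size, chrom_length)
        start += step


def fisher_statistic(p_values):
    """Fisher's combined statistic X = -2 * sum(log(p_i))."""
    return -2.0 * sum(math.log(p) for p in p_values)


def region_statistics(snps, chrom_length):
    """Map each window to the Fisher statistic of the SNPs it contains.

    snps is a list of (position, p_value) pairs for one chromosome.
    Windows without SNPs are skipped.
    """
    out = {}
    for start, end in windows(chrom_length):
        ps = [p for pos, p in snps if start <= pos < end]
        if ps:
            out[(start, end)] = fisher_statistic(ps)
    return out
```

Because neighboring windows overlap by 50 kb, a given SNP contributes to two regions, so the regional tests are not independent of one another.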

SNP × maternal CMV infection interaction analysis
The two-step method of Murcray et al. 43 was applied in order to test 
whether genetic variation interacts with maternal CMV infection 
influencing the risk of schizophrenia in the offspring. First, the full set of 
SNPs was screened for association with maternal CMV infection in the 
combined sample of cases and controls at a significance level of 0.05. 
Second, conditional logistic regression with inclusion of an interaction term
in the regression on the m SNPs selected in step 1 was performed using
Stata 10.0 (StataCorp LP, College Station, TX, USA). Because the tests
performed in steps 1 and 2 are (asymptotically) independent, 43 Bonferroni 
correction for the m tests in step 2 preserves the family-wise error rate. 
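
The bookkeeping of the two-step procedure can be sketched as below: SNPs are screened on their step-1 P-value, and the step-2 significance level is the family-wise level divided by the number m of SNPs that pass. The function names are hypothetical, and the regressions themselves (performed in PLINK and Stata in this study) are not reproduced.

```python
def screen(step1_p, alpha1=0.05):
    """Return the SNP ids whose step-1 P-value passes the screening level.

    step1_p maps SNP id -> P-value for association with maternal CMV.
    """
    return [snp for snp, p in step1_p.items() if p < alpha1]


def step2_threshold(m, alpha=0.05):
    """Bonferroni-corrected significance level for m step-2 tests."""
    if m <= 0:
        raise ValueError("m must be positive")
    return alpha / m
```

With the 29,082 SNPs that passed step 1 in this study, `step2_threshold(29082)` is approximately 1.72 × 10⁻⁶, the step-2 level reported in the Results.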



RESULTS 

Single-marker association analysis 

In the GWA analysis, 26,863 SNPs demonstrated an association
with schizophrenia with a P-value < 0.05 (Figure 1a). In all, 54 SNPs
showed P-values < 1 × 10⁻⁴, and the SNP demonstrating the
strongest association with schizophrenia was rs2836518 (P = 1.32
× 10⁻⁵), located on chromosome 21q22 in the intron of ERG
(list of all SNPs with P-values < 1 × 10⁻⁴ can be found in
Supplementary Table S5). 

In the stage 2 sample, 165 SNPs were successfully genotyped, of 
which nine demonstrated a directionally consistent association at 
nominal significance (Table 1). At the experiment level, the
associations of the SNPs genotyped in stage 2 demonstrated
significant, directionally consistent evidence of replication (binomial
sign test P < 0.0096). The SNP showing the strongest association in
the stage 2 sample was rs4757144 (P = 0.0059), located in the
intron of ARNTL on chromosome 11p15.

In the combined analysis of stage 1 and stage 2, four SNPs 
showed P-values < 1 × 10⁻⁴. The SNP demonstrating the
strongest association in this analysis was also the ARNTL SNP
rs4757144 (P = 3.78 × 10⁻⁶; Figure 1b). The other three markers
were rs8057927 located in the intron of CDH13 on chromosome
16q23 (P = 1.39 × 10⁻⁵; Figure 1c), rs2121783 located in the
intron of FOXP1 on 3p13 (P = 8.86 × 10⁻⁵) and rs3123688 on
10p11 located between ZEB1 and ZNF438, upstream of the transcription
start site of both genes (P = 9.05 × 10⁻⁵; Table 1).

In the extended meta-analysis, including both the Danish and 
German-Dutch individuals, four SNPs demonstrated P-values
< 1 × 10⁻⁵: rs12922317 located in the intron of RUNDC2A on
16p13 (P = 9.04 × 10⁻⁷; Figure 1d), rs8057927 located in the intron
of CDH13 on 16q23 (P = 1.20 × 10⁻⁶), rs6485671 located upstream of
CREB3L1 on 11p11 (P = 5.08 × 10⁻⁶) and rs4757144 in the intron of
ARNTL (P = 5.35 × 10⁻⁶; Table 1). Additional results of the
combined analyses can be found in Supplementary Table S6. 

Region-wise association analysis 

In total, 55,561 overlapping regions were tested for association with 
schizophrenia (Figure 2), requiring a region-based genome-wide 
significant level of P = 9.6 × 10⁻⁷ if Bonferroni correction is applied.
One region at chromosome 10p11:31,566,070-31,666,070 (hg18)
overlapping ZEB1 was genome-wide significant (P = 7.0 × 10⁻⁷;
Figure 2, Supplementary Table S7). Five genotyped SNPs were
located in this region (rs1314004, rs7083727, rs1314013, rs12242798
and rs3123688). Two of the SNPs are in high linkage disequilibrium
(rs3123688 and rs12242798, r² = 0.642374). The five SNPs were
genotyped in the stage 2 sample. However, rs12242798 failed genotyping
and was therefore imputed using the software MaCH 1.0
(http://www.sph.umich.edu/csg/abecasis/MACH/) 44 with HapMap 
phase III, Release #2, CEU, as reference population. The resulting 
genotypes were imputed with good quality (quality = 0.99, 
Rsq = 0.34). Significant region-wise association was found in the 
stage 2 sample (P = 0.023), thereby establishing formal replication of 
this locus in the Danish population. 

SNP × maternal CMV infection interaction analysis
Of the case mothers, 73.2% were CMV positive while that was the 
case for 70.7% of the control mothers, corresponding to an OR of 
1.13 (0.85-1.50), P = 0.39, for CMV with respect to schizophrenia
in the offspring. A total of 29,082 SNPs passed step 1, giving a
Bonferroni significance level of P = 1.72 × 10⁻⁶ at step 2. A single
SNP, rs7902091 (MAF 0.16 in cases and 0.15 in controls) located in 
an intron of CTNNA3 on chromosome 10q21, demonstrated 
experiment-wide significant interaction with maternal CMV 
infection, with an interaction P-value of 7.3 × 10⁻⁷ and interaction
OR of 5.3 under an additive genetic model (Figure 3,
Supplementary Table S8). On its own, rs7902091 showed no
association with schizophrenia (OR = 1.04, P = 0.67). For
non-carriers of the minor allele, no increased risk of schizophrenia from
maternal CMV was observed (OR = 0.72, P = 0.11), whereas for carriers
the risk increased to OR = 5.0 (P = 3.8 × 10⁻²). Furthermore, a
neighboring SNP, rs7919083 located 2206 bp from rs7902091,
demonstrated a relatively low interaction P-value (P = 5 × 10⁻⁴)
practically independent of rs7902091 (r² = 0.08).
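
The marginal odds ratio for maternal CMV given at the start of this subsection can be approximated from a 2 × 2 table. The sketch below uses an unmatched Wald interval, which is an assumption: the matched design may have been analyzed differently. The counts are reconstructed from the reported percentages (73.2% and 70.7% of 488 individuals each) and are not taken from the data; the function name is hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald 95% confidence interval for a 2x2 table.

    a, b: exposed and unexposed cases; c, d: exposed and unexposed controls.
    """
    or_ = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lo = math.exp(math.log(or_) - z * se)
    hi = math.exp(math.log(or_) + z * se)
    return or_, lo, hi


# Counts reconstructed from the reported percentages (assumption):
# 73.2% of 488 cases is about 357 CMV-positive; 70.7% of 488 controls about 345.
print(odds_ratio_ci(357, 131, 345, 143))
```

With these reconstructed counts the sketch gives an odds ratio of about 1.13 with an interval of roughly 0.85 to 1.49, in line with the reported estimate.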



DISCUSSION 

Here we report the results of a GWA study of schizophrenia using 
cases from a complete Danish birth cohort and follow-up 
investigations in additional samples, applying single variant and 
regional analyses, the latter identifying a novel locus at ZEB1. 
Moreover, conducting the first genome-wide gene-environment 
interaction survey in psychiatric disorders, we report significant 
interaction between CTNNA3 and maternal CMV infection. 

In the single variant analysis, none of the analyzed SNPs passed 
the widely accepted genome-wide significance threshold of P = 5
× 10⁻⁸. However, the results highlighted a number of loci with
the strongest signals located on 10p11, 11p15, 16q23 and 16p13
(Table 1). The SNP rs4757144, located in an intron of the circadian
rhythm-associated gene ARNTL on 11p15, demonstrated the
strongest association in the combined analysis of Danish stage 1 
and stage 2 individuals and was the fourth most associated SNP in 
the extended meta-analysis. ARNTL is expressed in several 
regions of the human brain. 45 Circadian-rhythm abnormalities in 
schizophrenia patients have been reported. 46-49 Several candidate
gene studies have investigated the involvement of circadian 
genes in schizophrenia and other psychiatric disorders, with some 
suggesting the involvement of ARNTL in disease risk 50-52 and 
others not. 53 The SNP rs4757144 reported in this study is in high
linkage disequilibrium (r² > 0.8, HapMap release 22) with two
SNPs previously reported to be associated with schizophrenia
(rs1982350 51) and bipolar disorder (rs4757142, 51 rs1982350 52).
ARNTL was also one of the top four candidate genes associated
with bipolar disorder identified by convergent functional
genomics data mining of existing GWAS data sets. 54

Figure 1. (a) Manhattan plot of genome-wide association (GWA) analysis. The blue line indicates P = 1 × 10⁻⁴. (b) Regional association plot of
rs4757144 located in ARNTL. (c) Regional association plot of rs8057927 located in CDH13. (d) Regional association plot of rs12922317 located in
RUNDC2A. The P-values in green are from the GWA analysis, P-values marked in blue are from the combined analysis of Danish individuals, and
P-values marked in purple are from the extended meta-analysis. The linkage disequilibrium (LD; r²) between the SNP in focus and its flanking
markers genotyped in the GWA study is shown in red (high LD) to white (low LD). The recombination rate is plotted in blue according
to HapMap (CEU).

The second most associated SNP in both the combined analysis 
of the Danish samples and in the extended meta-analysis was 
rs8057927 in the intron of CDH13 on 16q23. CDH13 encodes 
cadherin-13, a member of the cadherin super family of molecules 
that mediates Ca²⁺-dependent cell-cell adhesion in solid
tissue. 55-57 CDH13 is expressed in several parts of the adult
human brain 58 and appears to have a negative role in neural cell
proliferation in the developing nervous system. 58-60 The
implication of CDH13 in other psychiatric disorders has been 
suggested. GWA studies of attention deficit/hyperactivity disorder 
identified CDH13 as one of the most associated genes, 61-64 and a
meta-analysis of attention deficit/hyperactivity disorder linkage
scans identified the region with CDH13 as the only genome-wide
significant region. 65 GWA studies have also indicated the involvement of
CDH13 in depression 66 and autism, 67 and a recent study
implicated CNVs encompassing CDH13 in autism susceptibility. 68
The important role of cadherin-13 during brain development and 
in maintaining neural circuitry together with the reports of 
involvement of CDH13 in other psychiatric disorders therefore 
support our result, which, for the first time, suggests the 
involvement of CDH13 in schizophrenia (discussion of other top 
hits from the extended meta-analysis can be found in the 
Supplementary Material). 

One SNP (rs10828623) out of the 10 most associated SNPs in the
combined analysis of Danish stage 1 and stage 2 individuals was
nominally significantly associated with schizophrenia in the data
from the Psychiatric Genomics Consortium (PGC; P = 0.002), 11
resulting in a P-value = 4.54 × 10⁻⁶ in the combined analysis of
Danish stage 1 and 2 individuals and the PGC samples. The 
German-Dutch sample was included in the PGC data. Thus, there 
was no overlap in the discovery and follow-up samples. The 
limited replication could be due to genetic heterogeneity 
between the Danish and PGC samples, reducing the power to 
detect variants with small effects. The single marker loci 
demonstrating the strongest association in this study are 
therefore only valid for Danish, German and Dutch populations. 

Applying a regional analysis, summarizing independent signals 
in relatively small segments of overlapping regions of 100 kb, we
found region-based genome-wide significant association at a
region on 10p11 containing ZEB1. The applied significance level
was based on Bonferroni correction of the total number of
analyzed regions, which is analogous to how the conventional
GWAS threshold for single-SNP association of 5 × 10⁻⁸ is deduced
but which in this case is conservative due to the regions being 
50% overlapping and therefore far from independent. This 
approach was able to identify a novel risk locus even though it 
was performed using a small sample compared with recent GWA 
studies, indicating that aggregating P-values in this fashion can be 
a powerful approach. This is supported by the accumulating 
observations of independent association signals from closely 
positioned SNPs (see, for example, Steinberg et al. 4 and Ripke
et al. 11). Moreover, the region showed significant association in the
Danish stage 2 sample, providing independent replication of this 
locus in the Danish population. No replication was attempted in 
the German-Dutch GWA data set or the PGC data as not all SNPs 
(or proxies) in this region were present in these data sets. ZEB1 
encodes an E-box binding zinc finger transcription factor, which is 
widely expressed in the central nervous system and has an 
important role in development of the brain 69 and neuronal 
differentiation. 70 The associated region includes the promoter of 
ZEB1 and could therefore be involved in or linked to variants 
involved in regulation of expression. This is intriguing as it has 
been demonstrated that the expression of ZEB1 is regulated by
another transcription factor protein, TCF4, 71 which is one of the 
best validated schizophrenia susceptibility genes. Two 
independent SNPs in this gene have passed the threshold for 
genome-wide significant association, 4,5,9 and several studies have 
found TCF4 SNPs demonstrating close to genome-wide 
significance. 5,11,12 Notably, studies have also found strong 
evidence for the involvement of both TCF4 and ZEB1 genetic 
variants in another disorder, namely Fuchs' corneal dystrophy. 72,73
Interestingly, ZEB1 is involved in regulation of cadherin-13
expression by physically binding to the E2-box in the promoter
region of CDH13, decreasing the expression of the gene; 74
however, this finding still needs to be confirmed in nerve cells. 
Our results together with previous findings could therefore 
indicate that ZEB1, TCF4 and cadherin-13 are elements of a 
common pathway involved in schizophrenia. 

The interaction analysis of SNPs with maternal CMV infection 
found a significant interaction at CTNNA3, using an efficient two- 
step method where only SNPs passing step 1 were tested for 
interaction. 43 Thus, only SNPs showing nominal association with
CMV infection in the pooled sample of cases and controls were 
tested for interaction. This amounted to around 29,000 SNPs 
distributed across the genome. The interacting SNP was 
rs7902091 located in an intron of CTNNA3, just upstream of the
gene LRRTM3, which is nested within CTNNA3. CTNNA3 encodes 
catenin alpha-3, which is predominantly expressed in heart and 
testis but expression of the gene in the brain has also been 
demonstrated. 75 Catenin alpha-3 mediates cell-cell adhesion by 
functioning as a link between cadherin-based cell-cell adhesion 
complexes and the cytoskeleton. 76,77 Biologically the interaction of 
rs7902091 in CTNNA3 with maternal CMV makes sense, because 
CMV during infection may disrupt cell-to-cell connections by 
disconnecting the cadherin-catenin-actin complex within 
endothelial cells, 78 and in a study of human CMV in transgenic 
Drosophila, expression of the regulatory virus genes caused 
abnormal embryonic development by interfering with cell-to-cell 
adherens junctions through an effect on catenins. 79 The 
interaction observed suggests that the region around rs7902091 
in concert with maternal CMV infection may have a role in the 
etiology of schizophrenia. However, this should be replicated in 
additional studies. It is noteworthy, though, that neighboring SNPs 
(in particular rs7919083) showed a low interaction P-value 
independently of rs7902091, supporting the involvement of this 
locus. CTNNA3 and its nested gene LRRTM3 (encoding the Leucine- 
rich repeat transmembrane neuronal protein 3) have both 
previously been found associated with Alzheimer's disease 80-84
and with autism spectrum disorder. 67,85 In relation to Alzheimer's
disease, CTNNA3 has been observed to have a stronger effect in
females than in males. 82 We therefore performed a secondary 
analysis of gender differences in the interaction of CTNNA3 and 
CMV. The results are shown in the Supplementary Material 
(Supplementary Table S3). 

As in any other observational study involving environmental 
factors, confounding cannot be ruled out. With the apparent risk 
from CMV being turned on and off by the presence or absence of 
the variant, any confounder of CMV-schizophrenia association 
would confound the interaction result. For instance, the pre- 
valence of CMV infection has been reported to correlate with the 
prevalence of other infections, including other members of the 
Herpes family, 86 and socioeconomic status, 87 which could be
potential confounders. However, regardless of whether the CMV- 
schizophrenia association can be explained by confounding in 
part or completely, there is still interaction at the CTNNA3 
locus identifying sub-populations of different risk profiles for 
schizophrenia. 

We have reported the first GWA study and follow-up analysis of 
all the Danish individuals born since 1981 and diagnosed with 
schizophrenia up to 2010 and controls from the same birth cohort. 
Furthermore, we have followed up in an additional sample from a
genetically related population. The results support the findings 
from other GWA studies, suggesting the involvement of many 
common variants each contributing only slightly to disease risk. 
Applying a region-wise analysis, a new risk locus (at ZEB1) was 
identified and replicated. Several other plausible susceptibility loci 
were also suggested. This is also the first genome-wide study 
analyzing how maternal CMV infection interacts with the 
genotype of the progeny affecting the risk of schizophrenia, 
identifying a significant interaction at CTNNA3, a gene not 
previously implicated in schizophrenia. The result stresses the
importance of including environmental factors in the evaluation of
disease risk. Moreover, this is, to our knowledge, the first
significant gene-environment interaction identified in a
genome-wide survey of a psychiatric disorder. Future studies should
confirm the associations with schizophrenia of the genomic regions
demonstrating the strongest signals in this study, as well
as extend the inclusion of environmental factors when identifying
genetic risk variants. The unique samples from the DNSB
together with information from Danish register systems make it
possible to perform genetic studies with inclusion of a wealth of 
potential environmental risk factors. Future studies of the 
Danish population could therefore provide valuable insight 
into how gene-environment interactions influence the risk of 
schizophrenia. 